Skip to content

fix: op-tracing preflight, eval samples merge, compile device label, HTP tagged_nodes stats#209

Closed
xieofxie wants to merge 5 commits into
mainfrom
hualxie/fix_bugs
Closed

fix: op-tracing preflight, eval samples merge, compile device label, HTP tagged_nodes stats#209
xieofxie wants to merge 5 commits into
mainfrom
hualxie/fix_bugs

Conversation

@xieofxie

@xieofxie xieofxie commented Apr 1, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Bug A (P2) wmk perf --op-tracing: moved is_qnn_profiling_available() check to before PerfBenchmark.run() so the command fails fast instead of wasting a full benchmark run
  • Bug B (P2) wmk eval --samples N: replaced wholesale DatasetConfig overwrite with a merge that preserves user-specified samples, split, shuffle, and seed when falling back to a default dataset
  • Bug C (P3) wmk compile --ep dml: derive the displayed Device: label from _EP_TO_DEVICE[provider] instead of the raw CLI --device default (npu)
  • Bug D (P3) wmk export --no-hierarchy: gate tagged_nodes and coverage_percentage stats on embed_hierarchy_attributes so they report 0 when hierarchy tagging is disabled

Each bug is covered by a new regression test that failed before the fix and passes after.

@xieofxie xieofxie requested a review from a team as a code owner April 1, 2026 06:34
Comment thread src/winml/modelkit/export/htp/exporter.py Outdated
Comment thread tests/unit/export/test_htp_exporter_stats.py
Comment thread src/winml/modelkit/eval/evaluate.py Outdated
Comment thread src/winml/modelkit/export/htp/exporter.py Outdated
- Replace fragile class-default comparison in evaluate.py with explicit_fields
  sentinel tracking; CLI --samples/--split/--shuffle now use None defaults
- Add DatasetConfig.explicit_fields frozenset to track caller-set fields
- Deduplicate tagged_nodes/coverage stats via _update_tag_stats() helper in
  HTPExporter; export() stats block now calls the shared helper
- Fix test imports: HTPExporter from package level, DatasetConfig with
  explicit_fields in Bug B regression test
Comment thread src/winml/modelkit/commands/eval.py
Comment thread src/winml/modelkit/export/htp/exporter.py

@zhenchaoni zhenchaoni left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change related to eval and evaluate looks good to me

Comment thread src/winml/modelkit/commands/eval.py

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont quite get the default value changes here, why remove default values from cli args, and delay to the function call?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated the previous bugbash report in description

Bug B — P2: wmk eval --samples N silently ignored when no --dataset is given
Command: wmk eval -m microsoft/resnet-50 --samples 20
Symptom: Output always shows 'samples': 100 regardless of --samples value.
Root cause: evaluate.py:149-155 — when config.dataset.path is None (no --dataset flag), the entire DatasetConfig is replaced wholesale with the hardcoded _DEFAULT_DATASETS entry (which has samples=100), discarding the user's value:

if config.dataset.path is None:
config.dataset = deepcopy(default) # overwrites samples, split, shuffle
Fix: Merge user-specified fields (samples, split, shuffle, seed) onto the default rather than replacing it entirely.
Location: src/winml/modelkit/eval/evaluate.py:155

The None could let user decide either user value or default config value could take effect

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants